Latent words recurrent neural network language models

Authors

  • Ryo Masumura
  • Taichi Asami
  • Takanobu Oba
  • Hirokazu Masataki
  • Sumitaka Sakauchi
  • Akinori Ito
Abstract

This paper proposes a novel language modeling approach called the latent words recurrent neural network language model, which addresses problems present in both recurrent neural network language models (RNNLMs) and latent words language models (LWLMs). The proposed model has a soft class structure based on a latent variable space, as in LWLMs, but the latent variable space is modeled using an RNNLM. From the viewpoint of RNNLMs, the proposed model can be considered a soft class RNNLM with a vast latent variable space; from the viewpoint of LWLMs, it can be considered an LWLM that uses an RNN structure instead of an n-gram structure for latent variable modeling. This paper also details the parameter inference method and two ways of using the model in natural language processing tasks. Our experiments show the effectiveness of the proposed model in a perplexity evaluation on the Penn Treebank corpus and an automatic speech recognition evaluation on Japanese spontaneous speech tasks.
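
As a rough sketch of the factorization the abstract describes (the notation below is assumed for illustration, not taken from the paper), the model predicts each word by marginalizing over a latent word l_t whose sequence is modeled by the RNN:

P(w_t \mid w_{<t}) = \sum_{l_t \in V} P(w_t \mid l_t) \, P_{\text{RNN}}(l_t \mid l_{<t})

Here P_{\text{RNN}}(l_t \mid l_{<t}) is the RNNLM over the latent variable space, replacing the n-gram transition of a standard LWLM, and P(w_t \mid l_t) is the LWLM-style emission distribution from a latent word to an observed word. Summing over a vast latent vocabulary V at every position is expensive, which is presumably what the parameter inference method detailed in the paper has to work around.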


Similar resources

A Latent Variable Recurrent Neural Network for Discourse Relation Language Models

This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations that link adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively trained vector representations. The discourse relations are represented with a latent variable, which ...

A Latent Variable Recurrent Neural Network for Discourse-Driven Language Models

This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively trained vector representations. The discourse relations are represented with a latent variable, which ca...

Recurrent Neural Network-Based Semantic Variational Autoencoder for Sequence-to-Sequence Learning

Sequence-to-sequence (Seq2seq) models have played an important role in the recent success of various natural language processing methods, such as machine translation, text summarization, and speech recognition. However, current Seq2seq models have trouble preserving global latent information from a long sequence of words. The variational autoencoder (VAE) alleviates this problem by learning a continuo...
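
For context, the mechanism this summary alludes to is the standard VAE evidence lower bound (generic notation, not specific to this paper), which ties a global continuous latent variable z to the whole sequence x:

\log p(x) \ge \mathbb{E}_{q(z \mid x)}[\log p(x \mid z)] - \mathrm{KL}(q(z \mid x) \,\|\, p(z))

Here q(z \mid x) is the encoder and p(x \mid z) the decoder; maximizing this bound encourages z to preserve sequence-level information that a plain Seq2seq model tends to lose.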

A Linear Dynamical System Model for Text

Low dimensional representations of words allow accurate NLP models to be trained on limited annotated data. While most representations ignore words’ local context, a natural way to induce context-dependent representations is to perform inference in a probabilistic latent-variable sequence model. Given the recent success of continuous vector space word representations, we provide such an inferen...
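
For reference, a linear dynamical system in its standard form (generic notation, not taken from this paper) posits a latent state that evolves linearly and emits the observed word vectors:

h_t = A h_{t-1} + \epsilon_t, \qquad x_t = C h_t + \delta_t

with Gaussian noise terms \epsilon_t and \delta_t; posterior inference over h_t (e.g., Kalman smoothing) then yields the context-dependent word representations the summary mentions.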

Recurrent neural network language model adaptation for multi-genre broadcast speech recognition

Recurrent neural network language models (RNNLMs) have recently become increasingly popular for many applications, including speech recognition. In previous research, RNNLMs have normally been trained on well-matched in-domain data. The adaptation of RNNLMs remains an open research area to be explored. In this paper, genre- and topic-based RNNLM adaptation techniques are investigated for a multi-g...


Journal title:

Volume:   Issue:

Pages: -

Publication date: 2015